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AssTRACT. The problems involved in estimating 
the product moment correlation coefficient with 
spatially autocorrelated observations are set out, 
and a Monte Carlo experiment to explore risks 
made in inferring from such estimates is des- 
cribed. A range of different spatial lattices were 
used to investigate the errors introduced into 
the standard error estimator of Fisher’s trans- 
formation of the correlation coefficient. The results 
of the experiment demonstrate that inference 
from correlation coefficient estimates in the pres- 
ence of spatial autocorrelation is hazardous. 


Introduction 


The range of correlation coefficients available 
for the estimation of relationships between 
variables on the basis of sets of observations 
is wide. They share the property of varying 
between plus and minus unity, with zero 
representing the absence of a relationship 
between a given pair of variables. In the 
case of interval scaled observations, the 
Pearson product moment correlation co- 
efficient is most frequently chosen; this 
does, however, require that the observations 
be relatively normal, since the coefficient 
is sensitive to major deviations from nor- 
mality (Kowalski 1972, p. 11). After Siegel 


(1956, p. 18-34), one could reach for 
a non-parametric alternative, in the hope 
of getting an unbiased estimate of the coef- 
ficient. Unfortunately, one requirement shar- 
ed by all correlation coefficients is that 
the observations from which they are calcu- 
lated be independent. 

In the case of a sample from a reak 
or hypothetical population intended to 
estimate the unknown population value: 
of a correlation coefficient, its null distri- 
bution may be altered (cf. Student 1914, 
Yule 1926). Unwin and Hepple conclude 
that “since the null distribution varies 
with each pair of series and their respective: 
autocorrelations, ... the inferential analysis. 
of correlation matrices, and associated multi-- 
variate statistics, such as factor analysis, 
becomes hazardous” (1974, p. 220). These: 
considerations apply, of course, to series. 
of observations in time or in space. Spatial 
autocorrelation is defined in the following: 
way: “If the presence of some quality in 
a county of a country makes its presence: 
in neighbouring counties moré or less likely, 
we say that the phenomenon exhibits spatial 
autocorrelation” (Cliff and Ord 1973, p. 1). 
Since individual observations convey in- 
formation about their neighbours, the number 
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‘of degrees of freedom of a given set of 
Observations may be reduced. This phen- 
-omenon has been expressed by Cliff and Ord: 
“In general, when similar values of the varia- 
ble tend to cluster in space, we say that 
the variable exhibits positive spatial auto- 
«correlation... In the more usual situation 
-of positive spatial autocorrelation, we face 
‘the problem that an observation carries 
Jess information than an independent obser- 
vation, since it is partly predictable from 
meighbouring observations” (1975a, p. 725). 

The problem considered below concerns 
the estimation of the product moment cor- 
relation coefficient from series of obser- 
vations which are positively spatially auto- 
correlated. An attempt is made to estimate 
the values of the coefficient, and their dif- 
ferences from the values expected with 
strictly independent observations, thus yield- 
ing empirical estimates of the null distri- 
‘butions of the coefficients, for a number 
of spatial lattices, and a range of autocorre- 
lated series, 


Estimation 


When the correlation coefficient describing 
the relationship of two series is unknown, 
it is possible to estimate it without the 
further use of statistical significance tests. 
When the observations form a “sample” 
from a hypothetical population, it has 
been argued that hypothesis testing may 
be heuristically acceptable (Cliff 1973, for 
an opposing view, cf. Henkel 1976). Unfor- 
tunately, statistics estimating parameters 
using series of observations are known 
‘to be biased by autocorrelation; for example, 
ahe variance of a spatial series with positive 
Spatial autocorrelation will be biased down- 
‘wards using classical estimators (Unwin 
and’ Hepple 1974, p. 220, Cliff and Ord 


1975a, b, p. 300,. Bivand forthcoming). 
A further requirement concerns the choice of 
spatial scale, since it is known that spatial series 
can be altered or combined to yield very 
different estimates of the correlation between 
two variables (Yule and Kendal 1966, 
pp. 320-323, Openshaw 1977). 

The product moment correlation co- 
efficient is estimated as the covariance 
of two series of observations, x and y, 
divided by the product of their standard 
deviations: . 

tees, (a) 
\F(x) Fo) 


where Cov(x, y)= 


=@-D7 È &-DO-P, 
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Where the parameters of the process gov- 
erning the generation of a series are known, 
means and variances may be obtained using 
estimators given by Clif and Ord (1975a, 
p. 727). The notation adopted here reserves 
p for the process parameter, leaving r 
as the product moment correlation coefficient 
in the population, and ? as its estimator. 

Given an estimate of a population 
parameter, one is interested in its variance 
as a guide to inference about the value 
of the parameter. The construction of a con- 
fidence interval about the estimate allows 
us to test whether, for example, a parameter 
value of zero could fall within it, If the 
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distribution of the estimator is normal, 
the demarcation of the confidence interval 
is made easier (Wonnacott and Wonnacott 
1972, pp. 141 - 147). However, since the 
values of the product moment correlation 
coefficient are bounded, the distribution 
‘of its estimator is non-normal, but may be 
normalized using Fisher’s transformation 
{Yule and Kendall 1966, p. 497): 


z= in, r=tanh(z), 
iain, Fetter 
2 1-Fr 


The estimator of the square root of the 
variance of 2, the standard error, may be 
expressed as (David 1938, Kendall and 
Stuart 1963, pp. 390 - 391): 


Bova- 


(3) 
or 

62.5=V@-D'+44ro2@-D), (4) 
where ro is the hypothesized parameter 
value. Estimator (3) is recommended for 
n>50, and (4) for n>25, however, only 
(3) is used here. As may be seen, both depend 
on n, indicating that bias will be introduced 
by positive spatial autocorrelation, resulting 
in the underestimation of the standard 
error. This would lead to a narrower con- 
fidence interval being used for inference 
than is in fact justified by the information 
carried by the observations. In order to 
investigate the bias in the standard error 
estimator empirically, a Monte Carlo ex- 
periment was undertaken. 


Monte Carlo Study 


The purpose of the study was to estimate 
the values of 2 for a number of lattices, 
together with their biases and root mean 
square errors (Johnston 1972, p. 408). 


A simulation procedure was developed using 
samp es of n observations from a bivariate 
norma distribution. Spatial autocorre- 
lation was introduced using a first order auto- 
regressive model as has become accepted 
practice (Cliff and Ord 1973, p. 146 - 147, 
1975a, p. 730, Cliff, Martin and Ord 1974, 
p. 285, Martin 1974, p. 189): 


R 
X= P(x) È Wy Xe, (5) 
Y= Poy È, Wyeth, i=1,2,...,n, 
or in matrix form: 
x= Wr+E, 
(6) 


Y=Po) Wyte, 


where ¢,, ¢,~N(0, 1). 

Parameters pœ) and py, specify the overall 
degree of spatial autocorrelation in the lattice; 
w,, is an element of the matrix W of scaled 
weights, where: 


™ 


A 
wy=wpl p wij» 


1 if i and j are contiguous 
wj= neighbours, 


0 otherwise. 

It is clear that x and y may be generated by: 
2=(I— pu) W's, 

y=(I— py) WY z. 


Where is greater than zero, the transformed 
variable will contain z positively spatially 
autocorrelated observations. Since the cal- 
culation of the inverse matrix (I—pW)7, 
is inconvenient using standard procedures, 
a special inversion routine was written 
to invert the matrix using the power series 
of pW, since I-A) '= J, A*=I+4A +4? 
k=0 
+..., where 1>a,>0 (Hadley 1961, p. 
118-119). The series converges rapidly, 


(8) 
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especially with small p, reducing the time 
needed for inverting such large nxn and 
sparse matrices. 

The experiment was carried out in i the 
following way: 
1° Choose a lattice from the following: 

A — n=25, B=2.00, 5x5 regular lattice 
mapped onto a torus rook’s case, 

B — n=29, B=2.21, powiats of the 
former Poznań voivodeship, 

C — n=33, B=2.32, taxonomic econ- 
omic regions of Poland (Kukliński and Naj- 
grakowski 1976, p. 55), 

/D — n=49, B=2.00,7 x7 regular lattice 
mapped onto a torus, rook’s case, 

E — n=49, B=2.52, Polish voivodeships 
from 1th June 1975; 
where B=e/n, e=number of joins in the 
spatial system; 
2° Select the values of p to be used, here 
0.0, 0.5, 0.7, 0.9; 
3° Calculate the necessary matrices 
(I—pW)~ using the power series procedure; 
4° Generate 100 samples of [e¢] using the 
same random number stream from symnorm, 
a multivariate random number generator 
(Krzysko, Stolarski and Caliziski 1973). 
The parameter values for the means, co- 
variance matrix, and for z were u=(00)’, 


_f 1.0] 0.5 = i 
zfs] 10 „and z=0.5493; 


5° Choose a pair of values of p as Po) 
and po; 

6°- Calculate x and y using (8); 

7° Estimate the means and variances of the 

series, their covariance, correlation coef- 

ficient, and Z, repeating 6° and 7° 100 times; 

100 

8° Calculate the mean 2=—- § 2,, the 
100 z1 

mean bias 


100 


00 p=: 


and the root mean square error 
1 100 
RMSE=— J, (Z,-2)°; 
100 2 (z, ) > 
9° Return to 5° and choose a new combi- 
nation of values for pœ) and Py); 
10° Return to 1° and choose a new lattice. 


Results 


The results of the Monte Carlo experiment 
are tabulated by lattice, clearly indicating 
the influence of spatial autocorrelation 


TABLES 1-5. Resutts OF MONTE CARLO 


EXPERIMENT 
Lattice A 
n=25 5x5 f=2.00 
z= .5493 
O1Q=.-2132 
Pi» Po ? BIAS RMSE 
0 .0 5680 .0187 -2042 
0 3 5470 — .0023 .1919- 
0 7 5275 —.0218 1890 
0 3 4966 —.0527 .1929 
oS Be -5633 0140 .2092 
& sT .5596 .0103 -2190 
a 3 5453 — .0040 -2282 
$ kr d .5631 .0138 2362 
s$ 9 -5588 .0095 .2527 
2 9 .5651 0158 .2831 
Lattice B 
n=29 Poznań ĝ=2.21 
z=.5493 
Gr =.1961 
Puss Po z BIAS RMSE 
0 AC) -5686 .0193 1965 
0 F<) -5485 — .0008 .1944 
O°” @ .5152 —.0341 1957 
0 9 .4471 —.1022 2170 
5 a 5817 .0323 .2344 
rm at -5780 .0287 .2549 
5 32 5402 —.0091 -2779 
7 7 5887 0394 .2871 
ad. 9 5795 0302 3378 
9 > .6013 0520 4173 


Lattice. C 
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n=33 Taxonomic regions =2.32 
z= 5493 


31 =-1826 
Pia Din z BIAS RMSE 
0 0 -5463 —.0030 -1906 
0 > .5261 —.0232 .1888 
0 T 4923 —.0570 -1891 
(0 p 4223 —.1270 2116 
5 e .5571 0078 .2146 
5 ri 5483 —.0010 -2274 
5 Pp 5032 —.0461 2432 
mY ot 5613 -0120 -2063 
7 .9 5400 — .0093 .2920 
9 D> .5703 0210 -3805 
Laitice D 
n=49 7x7 p=2.00 
z=.5493 
Gy G=.1473 
Pix) Poy z BIAS RMSE 
Oo .0 .5587 .0094 -1590 
.0 e] 5391 —.0102 1517 
.0 7 -5133 —.0360 -1521 
0 2 -4628 — .0865 -1677 
5 5 -5688 -0195 -F72 
5 od 5659 0166 -1809 
Ps 2 5409 —.0084 -1941 
a? at -5781 .0288 -1970 
I K .5729 0236 .2235 
9 P .5974 .0481 .2628 
Larrice E 
n= 49 Voivodeships f=2.52 
z=.5493 
633 =.-1473 
pos. Do z BIAS RMSE 
0 0 5587 0094 -15090 
0 5 5404 —.0089 -1526 
0 a -5073 —.0420 F952. 
0 8 -4339 —.1154 .1880 
re 35 -5680 .0187 .1760 
5 si -5638 .0145 .1907 
53 .9 -5212 —.0281 2136 
id 7 5748 .0255 2184 
af 2 -5602 -0109 .2569 
9 3 -5868 .0375 -3270 


on the estimates of the standard error of 2, 
shown in the columns of root mean square: 
error values. It is obvious that the rejection’ 
of a null hypothesis that the parameter: 


'z=0.0 may occur through the use of the 


biased estimator G13, where the autocor- 
relation of the spatial series is marked. 
Bias in the estimatcr Z seems to stem rather 
from differerc2s in the processes generating 
the series of observations. 

A further point of interest in the bias of” 
the standard error estimator is that it appears 
to vary from lattice to lattice, confirming 
the view expressed by Unwin and Hepple 
cited above. For irregular lattices with both. 
series strongly autocorrelated, the width 
of the confidence interval given using the 
estimator §'& would seem to be about 
half that necessary to test the null hy— 
pothesis with an approximately correct level 
of significance. For the Polish voivodeships. 
lattice, n=49, B=2.52, 6,35=0.1473, 
RMSEp.5, 0.9=0.3270, and «#=0.05, the 
confidence interval would be 7+1.96x 
x 0.1473= 7 +0.2887, where the confidence- 
interval corresponding to a 5% significance 
level using the estimated standard error 
is in fact 2 +1.96 x 0.3270=2+0.6509. The 
confidence interval using 6,5 here thus 
represents a significance level not of 5% 
but of 37%. This seems to demonstrate: 
conclusively the dangers of inferring from 
the product: moment correlation coefficient 
when the observations may be spatially auto- 
correlated. 

Dr. Roger Bivand, Institute of Geography. 
Adam Mickiewicz University, Fredry 10, 
61-701 Pozrnai, Poland 
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